Automatic speech recognition using dynamic Bayesian networks with both acoustic and articulatory variables

Authors

  • Todd A. Stephenson
  • Hervé Bourlard
  • Samy Bengio
  • Andrew C. Morris
Abstract

Current technology for automatic speech recognition (ASR) uses hidden Markov models (HMMs) that recognize spoken speech using the acoustic signal. However, no use is made of the causes of the acoustic signal: the articulators. We present here a dynamic Bayesian network (DBN) model that utilizes an additional variable for representing the state of the articulators. A particular strength of the system is that, while it uses measured articulatory data during its training, it does not need to know these values during recognition. As Bayesian networks are not used often in the speech community, we give an introduction to them. After describing how they can be used in ASR, we present a system to do isolated word recognition using articulatory information. Recognition results are given, showing that a system with both acoustics and inferred articulatory positions performs better than a system with only acoustics.
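The central idea in the abstract, an articulatory variable that is observed in training but hidden at recognition time, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the variables are discrete, the articulatory variable `a` depends only on the current state `q`, the acoustic symbol `x` depends on both, and the names `forward_likelihood`, `recognize`, and `WORD_MODELS` as well as all probabilities are hypothetical. At recognition time `a` is summed out, so only the acoustic sequence is needed.

```python
def forward_likelihood(obs, init, trans, p_a_given_q, p_x_given_qa):
    """P(obs) for one word model, with the articulatory variable summed out.

    Assumed slice structure (illustrative only): q -> a, and (q, a) -> x.
    obs is a list of discrete acoustic symbols.
    """
    n_states = len(init)
    n_art = len(p_a_given_q[0])

    def emit(q, x):
        # p(x | q) = sum_a p(a | q) * p(x | q, a): a is never observed here.
        return sum(p_a_given_q[q][a] * p_x_given_qa[q][a][x] for a in range(n_art))

    # Standard forward recursion over the hidden state q.
    alpha = [init[q] * emit(q, obs[0]) for q in range(n_states)]
    for x in obs[1:]:
        alpha = [
            emit(q, x) * sum(alpha[p] * trans[p][q] for p in range(n_states))
            for q in range(n_states)
        ]
    return sum(alpha)


def recognize(obs, word_models):
    """Isolated word recognition: pick the word whose model scores obs highest."""
    return max(word_models, key=lambda w: forward_likelihood(obs, *word_models[w]))


# Toy left-to-right models with made-up numbers (2 states, 2 articulatory
# values, 2 acoustic symbols). In training, with a measured, tables like
# p_a_given_q and p_x_given_qa could be estimated from the labelled data.
INIT = [1.0, 0.0]
TRANS = [[0.5, 0.5], [0.0, 1.0]]
P_A_GIVEN_Q = [[0.9, 0.1], [0.2, 0.8]]

WORD_MODELS = {
    "yes": (INIT, TRANS, P_A_GIVEN_Q,
            [[[0.8, 0.2], [0.6, 0.4]], [[0.3, 0.7], [0.1, 0.9]]]),
    "no": (INIT, TRANS, P_A_GIVEN_Q,
           [[[0.2, 0.8], [0.4, 0.6]], [[0.7, 0.3], [0.9, 0.1]]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], WORD_MODELS))
```

Summing out `a` inside the emission term keeps the recognizer's interface identical to that of an acoustics-only HMM, which is one plausible reading of why measured articulatory values are not required during recognition. A model in which `a` also depends on its value in the previous time slice would need a forward recursion over the joint `(q, a)` state instead.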

Similar resources

Hidden feature models for speech recognition using dynamic Bayesian networks

In this paper, we investigate the use of dynamic Bayesian networks (DBNs) to explicitly represent models of hidden features, such as articulatory or other phonological features, for automatic speech recognition. In previous work using the idea of hidden features, the representation has typically been implicit, relying on a single hidden state to represent a combination of features. We present a...

Integration of articulatory dynamic parameters in HMM/BN based speech recognition system

In this paper, we describe several approaches to integration of the articulatory dynamic parameters along with articulatory position data into a HMM/BN model based automatic speech recognition system. This work is a continuation of our previous study, where we have successfully combined speech acoustic features in form of MFCC with articulatory position observations. Articulatory dynamic parame...

Production Knowledge in the Recognition of Dysarthric Speech

Frank Rudzicz, Doctor of Philosophy, Graduate Department of Computer Science, University of Toronto, 2011. Millions of individuals have acquired or have been born with neuro-motor conditions that limit the control of their muscles, including those that manipulate the articulators of the vocal tract. These conditions, collecti...

Speech Recognition with Dynamic Bayesian Networks

Dynamic Bayesian networks (DBNs) are a useful tool for representing complex stochastic processes. Recent developments in inference and learning in DBNs allow their use in real-world applications. In this paper, we apply DBNs to the problem of speech recognition. The factored state representation enabled by DBNs allows us to explicitly represent long-term articulatory and acoustic context in add...

Integration of articulatory and spectrum features based on the hybrid HMM/BN modeling framework

Most of the current state-of-the-art speech recognition systems are based on speech signal parametrizations that crudely model the behavior of the human auditory system. However, little or no use is usually made of the knowledge on the human speech production system. A data-driven statistical approach to incorporate this knowledge into ASR would require a substantial amount of data, which are n...

Publication date: 2000